DAL: A Locality-Optimizing Distributed Shared Memory System
Authors
Abstract
Latency-sensitive applications like virtualized telecom and industrial IoT systems require a service for ultrafast state externalization to become cloud-native. In this paper we propose a distributed shared memory system, called DAL, which achieves the lowest possible latency by transparently co-locating individual data items with applications working on them. Upon changes in data access patterns, the system automatically adapts data locations to keep the number of remote operations at a minimum. By avoiding the costs of network transport and using shared memory communication, the system can achieve 1 μs data access latency. We envision DAL as a platform component which enables latency-sensitive applications to take advantage of the cloud.
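The adaptation idea in the abstract (move a data item next to the application that keeps using it, so later accesses are local) can be illustrated with a toy model. The sketch below is not DAL's algorithm or API: the class name, the threshold-based migration rule, and the counter reset on local access are all assumptions made purely for illustration, and the "nodes" are just string labels inside a single process.

```python
from collections import Counter, defaultdict


class LocalityAwareStore:
    """Toy key-value store that moves a key to the node that keeps accessing it remotely.

    Hypothetical illustration only; the migration rule is an assumption, not DAL's policy.
    """

    def __init__(self, migrate_threshold=3):
        self.owner = {}                           # key -> node currently holding the value
        self.values = {}                          # key -> value
        self.remote_hits = defaultdict(Counter)   # key -> remote access count per node
        self.migrate_threshold = migrate_threshold
        self.remote_ops = 0                       # total accesses served remotely

    def put(self, node, key, value):
        # A new key is placed on the node that first writes it.
        self.owner.setdefault(key, node)
        self._record_access(node, key)
        self.values[key] = value

    def get(self, node, key):
        # Raises KeyError if the key was never written.
        self._record_access(node, key)
        return self.values[key]

    def _record_access(self, node, key):
        if self.owner[key] == node:
            # Local access: the current placement is still useful, so forget
            # the remote counts gathered so far (a simple form of hysteresis).
            self.remote_hits[key].clear()
            return
        self.remote_ops += 1
        self.remote_hits[key][node] += 1
        if self.remote_hits[key][node] >= self.migrate_threshold:
            # Enough uninterrupted remote accesses from one node: relocate the item there.
            self.owner[key] = node
            self.remote_hits[key].clear()


store = LocalityAwareStore(migrate_threshold=3)
store.put("node-A", "session:1", {"state": "attached"})
for _ in range(5):
    store.get("node-B", "session:1")
print(store.owner["session:1"], store.remote_ops)
```

In this toy model only the accesses made before the migration count as remote; once the item has moved, the same node reads it locally. A real system would also have to handle concurrent writers, consistency, and the cost of the move itself, none of which is modeled here.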
Similar resources
A Linear Algebra Framework for Automatic Determination of Optimal Data Layouts
This paper presents a data layout optimization technique for sequential and parallel programs based on the theory of hyperplanes from linear algebra. Given a program, our framework automatically determines suitable memory layouts that can be expressed by hyperplanes for each array that is referenced. We discuss the cases where data transformations are preferable to loop transformations and show...
Exploiting Spatial Store Locality Through Permission Caching in Software DSMs
Fine-grained software-based distributed shared memory (SW-DSM) systems typically maintain coherence with in-line checking code at load and store operations to shared memory. The instrumentation overhead of this added checking code can be severe. This paper (1) shows that most of the instrumentation overhead in the fine-grained SW-DSM system DSZOOM is store-related, (2) introduces a new write per...
Evaluation, Implementation and Performance of Write Permission Caching in the DSZOOM System
Fine-grained software-based distributed shared memory (SW-DSM) systems typically maintain coherence with in-line checking code at load and store operations to shared memory. The instrumentation overhead of this added checking code can be severe. This paper (1) shows that most of the instrumentation overhead in the fine-grained DSZOOM SW-DSM system is store-related, (2) introduces a new write per...
Understanding the Behavior of Shared Memory Applications Using the SMiLE Monitoring Framework
Data locality is a key factor for the performance of parallel systems. In a Distributed Shared Memory (DSM) system, however, it is difficult for the users to maintain a high data locality as it is usually a priori unknown how the data is distributed among the nodes. In this article we introduce a monitoring framework that allows users to understand the memory behavior of parallel applications. ...
Early Experience with Profiling and Optimizing Distributed Shared Cache Performance on Tilera’s Tile Processor
This paper describes our experience with profiling and optimizing physical locality for the distributed shared cache (DSC) in Tilera’s Tile multicore processor. Our approach uses the Tile Processor’s hardware performance measurement counters (PMCs) to acquire page-level access pattern profiles. A key problem we address is imprecise PMC interrupts. Our profiling tools use binary analysis to corr...